43 research outputs found

    Biasing MCTS with Features for General Games

    This paper proposes using a linear function approximator, rather than a deep neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for general games. This is unlikely to match the potential raw playing strength of DNNs, but has advantages in terms of generality, interpretability and resources (time and hardware) required for training. Features describing local patterns are used as inputs. The features are formulated in such a way that they are easily interpretable and applicable to a wide range of general games, and might encode simple local strategies. We gradually create new features during the same self-play training process used to learn feature weights. We evaluate the playing strength of an MCTS player biased by learnt features against a standard upper confidence bounds for trees (UCT) player in multiple board games, and demonstrate significantly improved playing strength in the majority of them after a small number of self-play training games. Comment: Accepted at IEEE CEC 2019, Special Session on Games. Copyright of final version held by IEEE.
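
The idea of biasing MCTS with a linear function of features can be illustrated with a minimal sketch. The abstract does not state which selection rule the paper uses, so the PUCT-style formula below is an assumption, and the feature names, weights and the constant `c` are hypothetical.

```python
import math


def softmax_policy(weights, features_per_move):
    """Turn a linear function of active features into move probabilities.

    weights: dict mapping feature name -> weight (hypothetical values)
    features_per_move: dict mapping move -> list of active feature names
    """
    # Linear score per move: sum of the weights of its active features.
    logits = {
        move: sum(weights.get(f, 0.0) for f in feats)
        for move, feats in features_per_move.items()
    }
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {move: math.exp(z - top) for move, z in logits.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}


def biased_selection_score(q, prior, parent_visits, child_visits, c=1.0):
    """PUCT-style score (an assumption, not taken from the paper):
    value estimate plus an exploration bonus scaled by the prior."""
    return q + c * prior * math.sqrt(parent_visits) / (1 + child_visits)


if __name__ == "__main__":
    weights = {"capture": 1.0, "edge": -0.5}            # hypothetical
    moves = {"a": ["capture"], "b": ["edge"], "c": []}  # hypothetical
    priors = softmax_policy(weights, moves)
    for move, p in priors.items():
        print(move, round(p, 3), biased_selection_score(0.0, p, 10, 0))
```

With equal value estimates and visit counts, the move whose active features carry the largest total weight receives the largest bonus, so the search visits it first.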

    Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

    In recent years, state-of-the-art game-playing agents often involve policies that are trained in self-play processes where Monte Carlo tree search (MCTS) algorithms and trained policies iteratively improve each other. The strongest results have been obtained when policies are trained to mimic the search behaviour of MCTS by minimising a cross-entropy loss. Because MCTS, by design, includes an element of exploration, policies trained in this manner are also likely to exhibit a similar extent of exploration. In this paper, we are interested in learning policies for a project with future goals including the extraction of interpretable strategies, rather than state-of-the-art game-playing performance. For these goals, we argue that such an extent of exploration is undesirable, and we propose a novel objective function for training policies that are not exploratory. We derive a policy gradient expression for maximising this objective function, which can be estimated using MCTS value estimates, rather than MCTS visit counts. We empirically evaluate various properties of resulting policies, in a variety of board games. Comment: Accepted at the IEEE Conference on Games (CoG) 201

    Using Local Search to Find MSSes and MUSes

    In this paper, a new complete technique to compute Maximal Satisfiable Subsets (MSSes) and Minimally Unsatisfiable Subformulas (MUSes) of sets of Boolean clauses is introduced. The approach improves the currently most efficient complete technique in several ways. It makes use of the powerful concept of critical clause and of a computationally inexpensive local search oracle to boost an exhaustive algorithm proposed by Liffiton and Sakallah. These features can allow exponential efficiency gains to be obtained. Accordingly, experimental studies show that this new approach outperforms the best existing exhaustive approaches.
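
To make the two definitions concrete, here is a brute-force sketch that enumerates every MSS and MUS of a tiny clause set. It is not the local-search technique of the paper, nor the Liffiton and Sakallah algorithm; it only illustrates what is being computed. Encoding literals as signed integers (DIMACS style) and the example clauses are assumptions.

```python
from itertools import combinations, product


def satisfiable(clauses):
    """Brute-force SAT check; a clause is a tuple of signed integer literals."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False


def mss_and_mus(clauses):
    """Return (MSSes, MUSes) as lists of frozensets of clause indices."""
    n = len(clauses)
    # Satisfiability of every subset of clause indices (exponential in n).
    sat = {}
    for k in range(n + 1):
        for combo in combinations(range(n), k):
            sat[frozenset(combo)] = satisfiable([clauses[i] for i in combo])
    # MSS: satisfiable, and adding any missing clause makes it unsatisfiable.
    msses = [s for s, ok in sat.items()
             if ok and all(not sat[s | {i}] for i in range(n) if i not in s)]
    # MUS: unsatisfiable, and removing any single clause makes it satisfiable.
    muses = [s for s, ok in sat.items()
             if not ok and all(sat[s - {i}] for i in s)]
    return msses, muses


if __name__ == "__main__":
    # Hypothetical example: x1, not x1, (x1 or x2), not x2
    example = [(1,), (-1,), (1, 2), (-2,)]
    msses, muses = mss_and_mus(example)
    print("MSSes:", sorted(sorted(s) for s in msses))
    print("MUSes:", sorted(sorted(s) for s in muses))
```

The enumeration grows exponentially with the number of clauses, which is why complete techniques that prune the search matter in practice.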

    The Use of Lung Ultrasound in the Management of Critical Care Patients

    Having demonstrated its ability to identify pneumothorax, to differentiate the causes of respiratory failure in dyspnoeic patients and to confirm the position of an endotracheal tube during endotracheal intubation, lung ultrasound has taken a prominent place in the management of critical care patients. Most studies, notably those on endotracheal intubation, evaluated the performance of clinicians with considerable experience in lung ultrasound, often in an ideal setting that allowed prolonged examinations. Given the growing availability of point-of-care ultrasound during the stabilisation and resuscitation of critical care patients, we wanted to evaluate the ability of a group of clinicians with heterogeneous ultrasound training to identify the presence or absence of lung sliding on short lung ultrasound sequences (comparable to the likely duration of an examination under resuscitation conditions) recorded in intubated patients. A total of 280 short lung ultrasound sequences (between 4 and 7 seconds) showing the presence or absence of lung sliding in patients intubated in the operating room were recorded and then presented in random order to two groups of emergency medicine clinicians. The second group could abstain when uncertain of their answer. We compared performance according to the level of academic and ultrasound training. The mean rate of correct identification of the presence or absence of lung sliding per participant was 67.5% (95% CI: 65.7-69.4) in the first group and 73.1% (95% CI: 70.7-75.5) in the second (p<0.001). The median rate of correct answers for each of the 280 sequences was 74.0% (IQR: 48.0-90.0) in the first group and 83.7% (IQR: 53.3-96.2) in the second (p=0.006). The rate of correct identification of the presence or absence of lung sliding by participants in both groups was markedly higher for sequences of the right hemithorax than for those of the left hemithorax (p=0.001). When physicians with varying academic and ultrasound training use short lung ultrasound sequences (more representative of real clinical use), the rate of correct identification of the presence or absence of lung sliding is higher when participants can abstain if in doubt about their answer. The rate of correct answers is also higher for right hemithorax sequences, probably owing to the underlying presence of the heart on the left, the smaller size of the left lung and the greater effect of the lung pulse in the left hemithorax. Considering these findings, caution is warranted when using the identification of lung sliding on short ultrasound sequences as a method of verifying the position of an endotracheal tube during endotracheal intubation, particularly for the left hemithorax. Particular attention should also be paid to recognising the lung pulse when teaching lung ultrasound.
The field of targeted lung ultrasound in critical care is constantly expanding. Its many proven uses include pneumothorax diagnosis, differentiation of the causes of acute dyspnoea and confirmation of endotracheal intubation. The studies on endotracheal intubation evaluated sonographers with extensive ultrasound training, sometimes using lengthy exams. Hence, with the growing presence of bedside lung ultrasound, we devised a study to evaluate the capacity of a heterogeneous group of physicians, with different levels of ultrasound training, to correctly identify lung sliding on random short sequences of recorded thoracic ultrasound.
A total of 280 short ultrasound sequences (4 to 7 seconds) showing present or absent lung sliding in intubated patients, recorded in the operating room, were randomly presented to 2 groups of physicians. Descriptive data, the mean accuracy of each participant and the rate of correct answers for each sequence were measured and compared across subgroups. Participants in the second group were instructed that they could abstain from answering in uncertain cases. Mean accuracy was 67.5% (95% CI: 65.7-69.4) in the first group and 73.1% (95% CI: 70.7-75.5) in the second (p<0.001). When considering each sequence individually, median accuracy was 74.0% (IQR: 48.0-90.0) in the first group and 83.7% (IQR: 53.3-96.2) in the second (p=0.006). The rate of correct answers was higher for right hemithorax sequences (p=0.001). Accuracy in lung sliding identification is better when participants can abstain from answering in uncertain cases. It is also better in the right hemithorax, probably owing to the presence of the heart and the lung pulse artefact in the left hemithorax. Considering our results, caution should be taken when using short ultrasound sequences for identifying lung sliding as a means of confirming endotracheal intubation, particularly in the left hemithorax. Emphasis should also be put on knowledge and identification of the lung pulse artefact when teaching the chest ultrasound curriculum.

    Measuring Board Game Distance

    This paper presents a general approach for measuring distances between board games within the Ludii general game system. These distances are calculated using a previously published set of general board game concepts, each of which represents a common game idea or shared property. Our results compare and contrast two different measures of distance, highlighting the subjective nature of such metrics and discussing the different ways that they can be interpreted.
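
A small sketch can show why two distance measures over the same concept values may disagree. The abstract does not name the two measures compared, so Euclidean and cosine distance are used here purely as an assumption, and the games and concept vectors are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two concept vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical concept vectors: three binary concepts plus a board size.
    game_a = [1, 0, 1, 8]
    game_b = [1, 0, 1, 4]
    game_c = [0, 1, 0, 8]
    for name, other in (("b", game_b), ("c", game_c)):
        print(name,
              round(euclidean_distance(game_a, other), 3),
              round(cosine_distance(game_a, other), 4))
```

In this example the Euclidean measure ranks game c as closer to game a, while the cosine measure ranks game b as closer. Which answer is preferable depends on whether the magnitude of a concept value should matter.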

    Ludii -- The Ludemic General Game System

    While current General Game Playing (GGP) systems facilitate useful research in Artificial Intelligence (AI) for game-playing, they are often somewhat specialised and computationally inefficient. In this paper, we describe the "ludemic" general game system Ludii, which has the potential to provide an efficient tool for AI researchers as well as game designers, historians, educators and practitioners in related fields. Ludii defines games as structures of ludemes -- high-level, easily understandable game concepts -- which allows for concise and human-understandable game descriptions. We formally describe Ludii and outline its main benefits: generality, extensibility, understandability and efficiency. Experimentally, Ludii outperforms one of the most efficient Game Description Language (GDL) reasoners, based on a propositional network, in all games available in the Tiltyard GGP repository. Moreover, Ludii is also competitive in terms of performance with the more recently proposed Regular Boardgames (RBG) system, and has various advantages in qualitative aspects such as generality. Comment: Accepted at ECAI 202

    General Board Game Concepts

    Many games share common ideas or aspects, such as their rules, controls, or playing area. However, in the context of General Game Playing (GGP) for board games, this area remains under-explored. We propose to formalise the notion of "game concept", inspired by terms generally used by game players and designers. Through the Ludii General Game System, we describe concepts for several levels of abstraction, such as the game itself, the moves played, or the states reached. This new GGP feature associated with the ludeme representation of games opens many new lines of research. The creation of a hyper-agent selector, the transfer of AI learning between games, or explaining AI techniques using game terms, can all be facilitated by the use of game concepts. Other applications which can benefit from game concepts are also discussed, such as the generation of plausible reconstructed rules for incomplete ancient games, or the implementation of a board game recommender system.

    Extracting tactics learned from self-play in general games

    Local, spatial state-action features can be used to effectively train linear policies from self-play in a wide variety of board games. Such policies can play games directly, or be used to bias tree search agents. However, the resulting feature sets can be large, with a significant amount of overlap and redundancy between features. This is a problem for two reasons. Firstly, large feature sets can be computationally expensive, which reduces the playing strength of agents based on them. Secondly, redundancies and correlations between features impair the ability of humans to analyse, interpret, or understand tactics learned by the policies. We look towards decision trees for their ability to perform feature selection and to serve as interpretable models. Previous work on distilling policies into decision trees uses states as inputs, and distributions over the complete action space as outputs. In contrast, we propose and evaluate a variety of decision tree types, which take state-action pairs as inputs and provide various types of outputs on a per-action basis. An empirical evaluation over 43 different board games is presented, and two of those games are used as case studies where we attempt to interpret the discovered features.